Search results for "High Dimension"

Showing 10 of 19 documents

On the empirical spectral distribution for certain models related to sample covariance matrices with different correlations

2021

Given [Formula: see text], we study two classes of large random matrices of the form [Formula: see text] where for every [Formula: see text], [Formula: see text] are iid copies of a random variable [Formula: see text], [Formula: see text], [Formula: see text] are two (not necessarily independent) sets of independent random vectors having different covariance matrices and generating well concentrated bilinear forms. We consider two main asymptotic regimes as [Formula: see text]: a standard one, where [Formula: see text], and a slightly modified one, where [Formula: see text] and [Formula: see text] while [Formula: see text] for some [Formula: see text]. Assuming that vectors [Formula: see t…
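In the standard regime where the dimension-to-sample-size ratio converges to a constant, the simplest instance of such models (iid entries, identity population covariance) has an empirical spectral distribution converging to the Marchenko–Pastur law. A minimal numerical sketch of that baseline case (our own illustration, not the paper's more general model):

```python
import numpy as np

def esd_sample_covariance(n, p, seed=None):
    """Sorted eigenvalues (empirical spectral distribution) of the
    sample covariance matrix S = X X^T / n with iid N(0,1) entries."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n))
    S = X @ X.T / n                       # p x p sample covariance
    return np.sort(np.linalg.eigvalsh(S))

n, p = 2000, 500
eigs = esd_sample_covariance(n, p, seed=0)
c = p / n                                 # ratio p/n -> c
# Marchenko-Pastur support edges for identity covariance: (1 +/- sqrt(c))^2
lo, hi = (1 - c**0.5) ** 2, (1 + c**0.5) ** 2
```

For these parameters the bulk of the spectrum falls inside the Marchenko–Pastur support [lo, hi], up to edge fluctuations.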

Statistics and Probability · Physics · Algebra and Number Theory · Spectral power distribution · Computer Science::Information Retrieval · Probability (math.PR) · Astrophysics::Instrumentation and Methods for Astrophysics · Block (permutation group theory) · Marchenko–Pastur law · Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) · Bilinear form · 60F05 60B20 47N30 · Sample mean and sample covariance · Combinatorics · Convergence of random variables · FOS: Mathematics · sample covariance matrices · Computer Science::General Literature · Discrete Mathematics and Combinatorics · Random matrices · high dimensional statistics · Statistics Probability and Uncertainty · Random matrix · Random variable · Mathematics - Probability · Random Matrices: Theory and Applications
researchProduct

An Extension of the DgLARS Method to High-Dimensional Relative Risk Regression Models

2020

In recent years, clinical studies in which patients are routinely screened for many genomic features have become more common. The general aim of such studies is to find genomic signatures useful for treatment decisions and for the development of new treatments. However, genomic data are typically noisy and high-dimensional, often outstripping the number of patients included in the study. For this reason, sparse estimators are usually used in the study of high-dimensional survival data. In this paper, we propose an extension of the differential geometric least angle regression (dgLARS) method to high-dimensional relative risk regression models.
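In relative risk (Cox-type) models the object being maximized is the partial log-likelihood; writing it out makes clear where high dimensionality bites (notation ours, not necessarily the paper's):

```latex
% Cox partial log-likelihood for covariates x_i \in \mathbb{R}^p, coefficients \beta,
% event indicator \delta_i and risk set R(t_i) at event time t_i:
\ell(\beta) \;=\; \sum_{i\,:\,\delta_i = 1}
  \Bigl[\, x_i^{\top}\beta \;-\; \log \sum_{j \in R(t_i)} \exp\bigl(x_j^{\top}\beta\bigr) \Bigr].
```

When p exceeds the number of patients the maximizer is not unique; dgLARS selects variables by following the solution path along which the Rao score statistics of the active covariates remain equal, rather than by adding an explicit penalty term.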

Clustering high-dimensional data · Computer science · dgLARS · Gene expression data · High-dimensional data · Relative risk regression models · Sparsity · Survival analysis · Least-angle regression · Relative risk · Statistics · Estimator · Regression analysis · Extension (predicate logic) · High dimensional · Settore SECS-S/01 - Statistica

From optimization to algorithmic differentiation: a graph detour

2021

This manuscript highlights the work of the author since his appointment as "Chargé de Recherche" (research scientist) at the Centre national de la recherche scientifique (CNRS) in 2015. In particular, the author traces the thematic and chronological evolution of his research interests:
- The first part, following his post-doctoral work, is concerned with the development of new algorithms for non-smooth optimization.
- The second part is the heart of his research in 2020. It is focused on the analysis of machine learning methods for graph (signal) processing.
- Finally, the third and last part, oriented towards the future, is concerned with (automatic or not) differentiation of algorithms for learnin…

Signaux sur graphes · Optimisation convexe · [STAT.ML] Statistics [stat]/Machine Learning [stat.ML] · High dimensional data · Graph signals · Statistiques en grande dimension · Automatic differentiation · [MATH.MATH-OC] Mathematics [math]/Optimization and Control [math.OC] · Convex optimization · Différentiation automatique

Monitoring of chicken meat freshness by means of a colorimetric sensor array

2012

A new optoelectronic nose to monitor chicken meat ageing has been developed. It is based on 16 pigments prepared by the incorporation of different dyes (pH indicators, Lewis acids, hydrogen-bonding derivatives, selective probes and natural dyes) into inorganic materials (UVM-7, silica and alumina). The colour changes of the sensor array were characteristic of chicken ageing in a modified packaging atmosphere (30% CO2 / 70% N2). The chromogenic array data were processed with qualitative (PCA) and quantitative (PLS) tools. The PCA statistical analysis showed a high degree of dispersion, with nine dimensions required to explain 95% of the variance. Despite this high dimensionality, a tridimensional re…
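The PCA step described in the abstract (explained-variance ratios, number of components needed for 95%) can be sketched as follows; the random data here merely stands in for the 16-channel colorimetric measurements:

```python
import numpy as np

def pca_explained_variance(X):
    """Fraction of total variance explained by each principal component,
    computed via the SVD of the centred data matrix."""
    Xc = X - X.mean(axis=0)                     # centre the observations
    _, s, _ = np.linalg.svd(Xc, full_matrices=False)
    var = s**2
    return var / var.sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 16))               # 40 samples x 16 sensor channels
ratios = pca_explained_variance(X)
# smallest number of components whose cumulative ratio reaches 95%
n_components_95 = int(np.searchsorted(np.cumsum(ratios), 0.95) + 1)
```

With real sensor-array data the study found nine components were needed, reflecting the dispersion of the measurements.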

Quality Control · INGENIERIA DE LA CONSTRUCCION · Meat · Time Factors · Materials science · Analytical chemistry · Colorimetric sensor array · Biochemistry · Analytical Chemistry · QUIMICA ORGANICA · Sensor array · Linear regression · QUIMICA ANALITICA · Electrochemistry · Animals · Environmental Chemistry · Statistical analysis · Least-Squares Analysis · PROYECTOS DE INGENIERIA · Spectroscopy · Principal Component Analysis · Pigmentation · Chromogenic · QUIMICA INORGANICA · Principal component analysis · Colorimetry · Indicators and Reagents · Inorganic materials · High dimensionality · Biological system · Chickens · Food Analysis

Stochastic algorithms for robust statistics in high dimension

2016

This thesis focuses on stochastic algorithms in high dimension as well as their application in robust statistics. In what follows, the expression "high dimension" may be used when the size of the studied sample is large or when the variables under consideration take values in high-dimensional spaces (not necessarily finite-dimensional). In order to analyze this kind of data, it can be interesting to consider algorithms which are fast, which do not need to store all the data, and which allow the estimates to be updated easily. In large samples of high-dimensional data, outlier detection is often complicated. Nevertheless, these outliers, even if they are few, can strongly disturb simple indicators like the me…
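The kind of estimator the thesis studies, an averaged stochastic gradient algorithm for the geometric median, can be sketched as follows; the step-size choices are ours, not necessarily the author's. Each observation moves the iterate by a unit direction, so a single outlier has bounded influence:

```python
import numpy as np

def geometric_median(X, gamma=1.0, alpha=0.75):
    """Averaged stochastic gradient estimate of the geometric median.
    Each step moves the iterate toward the new point along a unit vector,
    scaled by gamma * n**-alpha; the Polyak-Ruppert average is returned."""
    m = X[0].astype(float)            # current iterate
    m_bar = m.copy()                  # running average of the iterates
    for n, x in enumerate(X[1:], start=1):
        diff = x - m
        norm = np.linalg.norm(diff)
        if norm > 0:
            m = m + gamma * n ** (-alpha) * diff / norm
        m_bar += (m - m_bar) / (n + 1)
    return m_bar

rng = np.random.default_rng(0)
bulk = 0.1 * rng.standard_normal((500, 2))    # well-behaved observations
outliers = np.full((10, 2), 100.0)            # a few gross outliers
data = np.vstack([bulk, outliers])
med = geometric_median(data)                  # stays near the bulk
```

Unlike the sample mean, which the ten outliers would drag far from the origin, the recursive median estimate remains close to the bulk of the data.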

Stochastic Algorithms · Algorithmes Stochastiques · Algorithmes Récursifs · Recursive Algorithms · Statistique Robuste · Algorithmes de Gradient Stochastiques · Averaging · Stochastic Gradient Algorithms · Moyennisation · Grande Dimension · Robust Statistics · Functional Data · Données Fonctionnelles · [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST] · Geometric Median · High Dimension · Médiane Géométrique

Sparse relative risk survival modelling

2016

Cancer survival is thought to be closely linked to the genomic constitution of the tumour. Discovering such signatures will be useful in the diagnosis of the patient and may be used for treatment decisions and perhaps even the development of new treatments. However, genomic data are typically noisy and high-dimensional, often outstripping the number of patients included in the study. Regularized survival models have been proposed to deal with such scenarios. These methods typically induce sparsity by exploiting a coincidental match between the geometry of the convex likelihood and a (near) non-convex regularizer.

gene expression data · relative risk regression model · survival analysis · sparsity · high dimensional data · differential geometry · dglars · Settore SECS-S/01 - Statistica

A fast and recursive algorithm for clustering large datasets with k-medians

2012

Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…
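A MacQueen-style sequential scheme for the $k$-medians loss can be sketched as follows; initialisation and step-size constants here are our own choices, not the paper's. Each observation updates only its nearest centre, by a unit-norm step (the stochastic gradient of the $k$-medians criterion):

```python
import numpy as np

def online_k_medians(X, k, gamma=0.5, alpha=0.66):
    """Recursive stochastic gradient algorithm for the k-medians loss
    sum_i min_j ||x_i - c_j||, processing one observation at a time."""
    centers = X[:k].astype(float)        # crude initialisation: first k points
    counts = np.ones(k)                  # per-centre update counters
    for x in X[k:]:
        d = np.linalg.norm(centers - x, axis=1)
        j = int(np.argmin(d))            # nearest centre
        counts[j] += 1
        if d[j] > 0:                     # unit-norm step toward x
            centers[j] += gamma * counts[j] ** (-alpha) * (x - centers[j]) / d[j]
    return centers

rng = np.random.default_rng(1)
a = rng.normal(0.0, 0.5, size=(200, 2))       # cluster around (0, 0)
b = rng.normal(10.0, 0.5, size=(200, 2))      # cluster around (10, 10)
X = np.empty((400, 2))
X[0::2], X[1::2] = a, b                       # interleave: data arrive as a stream
centers = online_k_medians(X, k=2)
```

Because each point touches a single centre and the data never need to be stored, the scheme handles streams of large samples, as the abstract describes.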

Statistics and Probability · Clustering high-dimensional data · FOS: Computer and information sciences · Mathematical optimization · high dimensional data · Machine Learning (stat.ML) · 02 engineering and technology · Stochastic approximation · 01 natural sciences · Statistics - Computation · 010104 statistics & probability · k-medoids · Statistics - Machine Learning · [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST] · stochastic approximation · 0202 electrical engineering electronic engineering information engineering · Computational statistics · recursive estimators · Almost surely · 0101 mathematics · Cluster analysis · Computation (stat.CO) · Mathematics · averaging · Robbins Monro · Applied Mathematics · Estimator · [STAT.TH] Statistics [stat]/Statistics Theory [stat.TH] · stochastic gradient · Medoid · Computational Mathematics · Computational Theory and Mathematics · online clustering · 020201 artificial intelligence & image processing · partitioning around medoids · Algorithm

2014

Classification of large data sets is widely used in many industrial applications. It is a challenging task to classify large data sets efficiently, accurately, and robustly, as they typically contain numerous instances with high-dimensional feature spaces. In order to deal with this problem, in this paper we present an online LogDet divergence based metric learning (LDML) model that exploits the power of metric learning. We first generate a Mahalanobis matrix by learning the training data with the LDML model. Meanwhile, we propose a compressed representation of the high-dimensional Mahalanobis matrix to reduce the computational complexity of each iteration. The final Mahalanobis mat…
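A toy version of the kind of rank-one Mahalanobis update that LogDet-divergence schemes rely on, simplified by us and not the paper's exact online algorithm: given a difference vector z, the matrix is adjusted so the squared distance z^T M z hits a target value, and the update form M + beta * M z z^T M keeps M positive definite whenever the target is positive.

```python
import numpy as np

def logdet_rank_one_update(M, z, target):
    """Rank-one update M' = M + beta * M z z^T M chosen so that
    z^T M' z == target exactly; preserves positive definiteness for target > 0."""
    p = float(z @ M @ z)                  # current squared distance (assumed > 0)
    beta = (target - p) / p**2
    Mz = M @ z
    return M + beta * np.outer(Mz, Mz)

M = np.eye(3)                             # start from the Euclidean metric
z = np.array([1.0, 2.0, 0.0])             # difference of a constrained pair
M2 = logdet_rank_one_update(M, z, target=1.0)
```

Such updates cost O(p^2) per constraint, which is what motivates the compressed representation of the Mahalanobis matrix mentioned in the abstract.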

Mahalanobis distance · Training set · Applied Mathematics · Feature vector · High dimensional · computer.software_genre · Computation complexity · Data mining · Benchmark data · Classifier (UML) · computer · Algorithm · Analysis · Mathematics · Abstract and Applied Analysis

LogDet divergence-based metric learning with triplet constraints and its applications.

2014

How to select and weigh features has always been a difficult problem in many image processing and pattern recognition applications. A data-dependent distance measure can address this problem to a certain extent, and therefore accurate and efficient metric learning becomes necessary. In this paper, we propose a LogDet divergence-based metric learning with triplet constraints (LDMLT) approach, which can learn a Mahalanobis distance metric accurately and efficiently. First of all, we demonstrate the good properties of triplet constraints and apply them in the LogDet divergence-based metric learning model. Then, to deal with high-dimensional data, we apply a compressed representation method to learn…
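For illustration, a generic triplet-constraint step on a Mahalanobis matrix, a hinge-loss gradient step followed by projection onto the PSD cone; this is a common baseline, not the exact LDMLT update rule:

```python
import numpy as np

def triplet_metric_step(M, a, p, n, margin=1.0, lr=0.1):
    """One hinge-loss step for the triplet (anchor a, positive p, negative n):
    if d_M(a, p) is not smaller than d_M(a, n) by `margin`, update M and
    project it back to the positive semidefinite cone by eigenvalue clipping."""
    dp, dn = a - p, a - n
    loss = dp @ M @ dp - dn @ M @ dn + margin
    if loss > 0:                                   # constraint violated
        M = M - lr * (np.outer(dp, dp) - np.outer(dn, dn))
        w, V = np.linalg.eigh(M)                   # PSD projection
        M = (V * np.clip(w, 0.0, None)) @ V.T
    return M

M = np.eye(2)                                      # initial Euclidean metric
a = np.array([0.0, 0.0])                           # anchor
p = np.array([1.0, 0.0])                           # should be close to a
n = np.array([0.5, 0.0])                           # should be far from a
M2 = triplet_metric_step(M, a, p, n)
```

After the step, the gap d_M(a, p) - d_M(a, n) shrinks, which is exactly what a triplet constraint asks for.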

Automated · Data Interpretation · Biometry · Feature extraction · high dimensional data · metric learning · Pattern Recognition · Facial recognition system · Sensitivity and Specificity · Matrix decomposition · Pattern Recognition Automated · compressed representation · Computer-Assisted · Artificial Intelligence · Image Interpretation Computer-Assisted · Photography · Humans · Divergence (statistics) · Image retrieval · Image Interpretation · Mathematics · Mahalanobis distance · business.industry · LogDet divergence · Medicine (all) · Reproducibility of Results · Pattern recognition · Statistical · Image Enhancement · Computer Graphics and Computer-Aided Design · Facial Expression · ComputingMethodologies_PATTERNRECOGNITION · Computer Science::Computer Vision and Pattern Recognition · Data Interpretation Statistical · Face · Metric (mathematics) · Pattern recognition (psychology) · Artificial intelligence · triplet constraint · business · Software · Algorithms · IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

On Shimura subvarieties of the Prym locus

2018

We show that families of Pryms of abelian Galois covers of $\mathbb{P}^1$ in $A_{g-1}$ (resp. $A_g$) do not give rise to high-dimensional Shimura subvarieties.

Shimura variety · Pure mathematics · Algebra and Number Theory · Mathematics::Number Theory · 010102 general mathematics · 010103 numerical & computational mathematics · High dimensional · Prym variety · 01 natural sciences · Mathematics - Algebraic Geometry · Mathematics::Algebraic Geometry · FOS: Mathematics · 0101 mathematics · Abelian group · Locus (mathematics) · Algebraic Geometry (math.AG) · Mathematics